RELATIVE RANGE
CAMERA CALIBRATION

by

John C. Krumm

BACKGROUND OF THE INVENTION 

1. Field of the invention 

The present invention relates in general to range imaging systems
and more particularly to a method and a system for measuring a relative
position and orientation of range cameras using a movement of an object
within a scene.



2. Related Art

Range imaging systems are used in a variety of applications to
determine the three-dimensional (3-D) characteristics of a scene (a scene
is an environment of interest). By way of example, these applications
include 3-D scene reconstruction, 3-D object recognition, robot navigation,
terrain mapping and object tracking. An important component of a range
imaging system is a range camera. A range camera is a device that is
used to measure a 3-D structure of a scene by providing range (or depth)
information as measured from a plane on the camera. Thus, while a black
and white camera provides a grayscale intensity of each pixel and a color
camera provides a color of each pixel, a range camera provides a range
(or distance to the 3-D scene) of each pixel. Range cameras use a variety
of techniques to measure range including lasers, projected light patterns
and stereo vision.

For some applications (such as tracking persons within a scene) the
range imaging system may include more than one range camera because
a single range camera may not have a sufficiently large field of view to
monitor the entire scene. In order for multiple range cameras to work
together, however, the cameras must be calibrated to determine a position
and an orientation of each camera relative to one of the cameras (known
as a relative pose). This calibration of multiple cameras enables the
ranging system to convert 3-D measurements obtained from each camera
into a common coordinate frame. For example, a path of a person in a
scene may be measured by each camera in its local coordinate frame and
converted to a common coordinate frame (such as a room-based
coordinate system).

Several types of manual calibration techniques are used to calibrate
the range cameras. One type of calibration technique uses a
three-dimensional calibration chart to determine the relative position of
each camera. This technique, however, is difficult to use and
time-consuming because it requires that the calibration chart be
positioned correctly within a scene.

Another type of calibration technique requires a user to monitor a
scene and determine a plurality of reference points in the scene until the
relative position of each camera can be determined. For example, a user
references a number of common points in a scene (within each camera's
field of view) and, if enough of these common points are found, the relative
pose of the cameras may be determined. One disadvantage of this
technique, however, is that it is difficult to implement in a consumer-based
product because it is unlikely the consumer would want to perform such a
complicated and time-consuming calibration process. Moreover, with both
types of calibration techniques, if the consumer performed the calibration
process improperly, any results obtained from the range imaging system
would be erroneous.

Accordingly, there exists a need for a range camera calibration 
method and system that is accurate and simple to use. Whatever the 
merits of the above-mentioned systems and methods, they do not achieve 
the benefits of the present invention. 

SUMMARY OF THE INVENTION 

To overcome the limitations in the prior art as described above and 
other limitations that will become apparent upon reading and 
understanding the present specification, the present invention includes a 
method and system for determining a relative position and orientation of a 
plurality of range cameras using spatial movement. In particular, a path of 
an object is measured by each range camera in the camera's local 
coordinate frame. Thus, the path of the object is observed by each 
camera but, because each camera has a different view of the object's 
path, the object path is reported by each camera in different local 
coordinate frames. 

The present invention determines the relative location of each range 
camera by converting the object path as measured in each of the local 
coordinate frames to a common coordinate frame. The common 
coordinate frame may be, for example, with respect to one of the cameras 
or with respect to the scene (such as a room-based coordinate system). 

In general, the novel method of the present invention includes
measuring a path of an object in a scene as observed by each camera,
performing matching of points of the path and obtaining transformation
parameters (such as an offset distance (Δx, Δy) and a rotation angle (θ)),
preferably by solving a system of transformation equations. These
transformation parameters are used to determine the relative position of
each camera. Moreover, the present invention includes other novel
features such as a data synchronization feature that uses a time shift
between cameras to obtain the transformation parameters. In addition, the
present invention includes a unique process that improves the robustness
and accuracy of solving the system of transformation equations by using a
process that is less sensitive to outlying points. For example, in a
preferred implementation the present invention includes using a least
median of squares technique to reduce the sensitivity of the solution to
points extremely removed from the correct solution. The present invention
also includes an interpolation process that interpolates between sampled
points if there is no data at a particular instant in time. Further, the present
invention includes a system for determining a relative position and
orientation of range cameras using spatial movement that incorporates the
method of the present invention.

Other aspects and advantages of the present invention as well as a
more complete understanding thereof will become apparent from the
following detailed description, taken in conjunction with the accompanying
drawings, illustrating by way of example the principles of the invention.
Moreover, it is intended that the scope of the invention be limited by the
claims and not by the preceding summary or the following detailed
description.

BRIEF DESCRIPTION OF THE DRAWINGS 

The present invention can be further understood by reference to the
following description and attached drawings that illustrate the preferred
embodiments. Other features and advantages will be apparent from the
following detailed description of the invention, taken in conjunction with the
accompanying drawings, which illustrate, by way of example, the principles
of the present invention.

Referring now to the drawings in which like reference numbers 
represent corresponding parts throughout: 

FIG. 1 is a block diagram illustrating an apparatus for carrying out 
the present invention. 

FIG. 2 is an overall block diagram of a range imaging system 
incorporating the present invention. 

FIG. 3 is a general block diagram of the object tracker of the present
invention.

FIG. 4 is a block diagram illustrating the calibration module of the
object tracker shown in FIG. 3.

FIG. 5 is a block diagram illustrating the transformation processor
of the calibration module shown in FIG. 4.

FIG. 6 is a general flow diagram of the operation of the calibration
module of the present invention.

FIGS. 7A-7C are general block diagrams illustrating exemplary
operations of the calibration module shown in FIG. 4.

FIG. 8 is a detailed flow diagram illustrating a preferred embodiment
of the present invention.

FIGS. 9A-9D illustrate an example of how the present invention can
perform data matching.

FIG. 10 illustrates an example of how the present invention can 
determine an accurate time offset value. 

DETAILED DESCRIPTION OF THE INVENTION 

In the following description of the invention, reference is made to the 
accompanying drawings, which form a part thereof, and in which is shown 
by way of illustration a specific example whereby the invention may be
practiced. It is to be understood that other embodiments may be utilized
and structural changes may be made without departing from the scope of 
the present invention. 

I. Exemplary Operating Environment

FIG. 1 and the following discussion are intended to provide a brief,
general description of a suitable computing environment in which the
invention may be implemented. Although not required, the invention will
be described in the general context of computer-executable instructions,
such as program modules, being executed by a computer. Generally,
program modules include routines, programs, objects, components, data
structures, etc. that perform particular tasks or implement particular
abstract data types. Moreover, those skilled in the art will appreciate that
the invention may be practiced with a variety of computer system
configurations, including personal computers, server computers, hand-held
devices, multiprocessor systems, microprocessor-based or programmable
consumer electronics, network PCs, minicomputers, mainframe
computers, and the like. The invention may also be practiced in distributed
computing environments where tasks are performed by remote processing
devices that are linked through a communications network. In a distributed
computing environment, program modules may be located on both local
and remote computer storage media including memory storage devices.

With reference to FIG. 1, an exemplary system for implementing the
invention includes a general-purpose computing device in the form of a
conventional personal computer 100, including a processing unit 102, a
system memory 104, and a system bus 106 that couples various system
components including the system memory 104 to the processing unit 102.
The system bus 106 may be any of several types of bus structures
including a memory bus or memory controller, a peripheral bus, and a local
bus using any of a variety of bus architectures. The system memory
includes read only memory (ROM) 110 and random access memory
(RAM) 112. A basic input/output system (BIOS) 114, containing the basic
routines that help to transfer information between elements within the
personal computer 100, such as during start-up, is stored in ROM 110.
The personal computer 100 further includes a hard disk drive 116 for
reading from and writing to a hard disk, not shown, a magnetic disk drive
118 for reading from or writing to a removable magnetic disk 120, and an
optical disk drive 122 for reading from or writing to a removable optical disk
124 such as a CD-ROM or other optical media. The hard disk drive 116,
magnetic disk drive 118 and optical disk drive 122 are connected to the
system bus 106 by a hard disk drive interface 126, a magnetic disk drive
interface 128 and an optical disk drive interface 130, respectively. The
drives and their associated computer-readable media provide nonvolatile
storage of computer readable instructions, data structures, program
modules and other data for the personal computer 100.

Although the exemplary environment described herein employs a
hard disk, a removable magnetic disk 120 and a removable optical disk
124, it should be appreciated by those skilled in the art that other types of
computer readable media that can store data that is accessible by a
computer, such as magnetic cassettes, flash memory cards, digital video
disks, Bernoulli cartridges, random access memories (RAMs), read-only
memories (ROMs), and the like, may also be used in the exemplary
operating environment.

A number of program modules may be stored on the hard disk,
magnetic disk 120, optical disk 124, ROM 110 or RAM 112, including an
operating system 132, one or more application programs 134, other
program modules 136 and program data 138. A user (not shown) may
enter commands and information into the personal computer 100 through
input devices such as a keyboard 140 and a pointing device 142. In
addition, a camera 143 (or other types of imaging devices) may be
connected to the personal computer 100 as well as other input devices
(not shown) including, for example, a microphone, joystick, game pad,
satellite dish, scanner, or the like. These other input devices are often
connected to the processing unit 102 through a serial port interface 144
that is coupled to the system bus 106, but may be connected by other
interfaces, such as a parallel port, a game port or a universal serial bus
(USB). A monitor 146 or other type of display device is also connected to
the system bus 106 via an interface, such as a video adapter 148. In
addition to the monitor 146, personal computers typically include other
peripheral output devices (not shown), such as speakers and printers.

The personal computer 100 may operate in a networked
environment using logical connections to one or more remote computers,
such as a remote computer 150. The remote computer 150 may be
another personal computer, a server, a router, a network PC, a peer
device or other common network node, and typically includes many or all
of the elements described above relative to the personal computer 100,
although only a memory storage device 152 has been illustrated in FIG. 1.
The logical connections depicted in FIG. 1 include a local area network
(LAN) 154 and a wide area network (WAN) 156. Such networking
environments are commonplace in offices, enterprise-wide computer
networks, intranets and the Internet.

When used in a LAN networking environment, the personal
computer 100 is connected to the local network 154 through a network
interface or adapter 158. When used in a WAN networking environment,
the personal computer 100 typically includes a modem 160 or other means
for establishing communications over the wide area network 156, such as
the Internet. The modem 160, which may be internal or external, is
connected to the system bus 106 via the serial port interface 144. In a
networked environment, program modules depicted relative to the personal
computer 100, or portions thereof, may be stored in the remote memory
storage device 152. It will be appreciated that the network connections
shown are exemplary and other means of establishing a communications
link between the computers may be used.



II. Introduction 

The method and system of the present invention include measuring
the relative position and orientation of at least two range cameras. Range
cameras, which are used to measure the 3-D structure of a scene, give the
range (or depth) of each pixel. In order for two or more range cameras to
work properly together, the system (such as a range imaging system)
using the range cameras must be able to determine a relative position and
orientation of each camera. This process of determining a relative pose of
each camera (also known as calibration) enables the system to convert
3-D measurements from each camera into a common coordinate frame.
Data from each camera is in the camera's local coordinate frame, and
calibration of each camera makes the 3-D measurements from different
cameras (in different local coordinate frames) consistent with each other.

The present invention measures a relative pose between a plurality
of range cameras by measuring a relative pose between two cameras at a
time. One camera is designated as a base camera and relative poses of
the remainder of the cameras can be measured relative to the base
camera. In general, the present invention calibrates range cameras based
on a path of an object around a scene. The object path is determined in a
ground plane (such as a floor of a room) as a function of time as measured
by a range camera. The present invention determines the transformation
parameters that take a point on the object path measured by a non-base
camera and convert it to a point as it would be seen from the base camera.
In addition, the present invention includes synchronizing data obtained
from each camera, interpolating between sampled data points and using a
robust error minimization technique to determine the transformation
parameters.

III. General Overview 

As shown in FIGS. 2-10 for the purposes of illustration, the invention
is embodied in a method and a system for measuring a relative position
and orientation of range cameras using a movement of an object within a
scene. FIG. 2 is an overall block diagram of a range imaging system
incorporating the present invention. The range imaging system illustrated
is only one example of several systems that could incorporate the relative
range camera calibration method and system of the present invention. In
general, the range imaging system 200 includes a first camera 208 and a
second camera 216. Each of the cameras 208, 216 may use any of the
various techniques available to measure range, such as, for example,
lasers, projected light patterns and stereo vision. Both of the cameras 208,
216 are directed toward a scene 224 and are capable of measuring a 3-D
structure of the scene 224.

The range imaging system also includes a first data module 232 that
samples raw position data from the first camera 208 and a second data
module 236 that samples raw position data from the second camera 216.
These data modules 232, 236 may be, for example, computers or
microprocessors. The first camera 208 supplies position data about the
scene 224 in a first local coordinate frame and the second camera 216
supplies position data about the scene 224 in a second local coordinate
frame. These two local coordinate frames generally are not the same, and
calibration of the two cameras 208, 216 is necessary to express the
position data from each camera in a common coordinate frame.

The sampled data from each camera is sent to an object tracker
240, which inputs the sampled data, calibrates the cameras 208, 216 and
performs a coordinate transformation of the data. Further, an output
module 248 is included in the range imaging system 200 that outputs
scene data in a common coordinate system (such as a room-based
coordinate system). In this example, the scene 224 includes a room 256
containing a first sofa 264 on one side of the room 256 and a second sofa
272 opposite the first sofa 264. In addition, a chair 280 is situated between
the two sofas 264, 272.

In this range imaging system, calibration of the range cameras 208,
216 generally is performed by having a person 288 (denoted by an "X")
move in a path 296 around the room 256. This path 296 is observed by
the cameras 208, 216 in their respective local coordinate frames and the
raw position data (such as (x,y) coordinates) of the path 296 is sampled by
the data modules 232, 236. The data modules 232, 236 sample raw
position data from each camera that includes the object path 296
described in a first local coordinate frame (as observed by the first camera
208) and the object path 296 described in a second local coordinate frame
(as observed by the second camera 216).

The object tracker 240 receives the sampled data from the data
modules 232, 236 and, using the present invention, calibrates cameras
208, 216 by determining the relative position and orientation of each
camera. Once the cameras 208, 216 are calibrated, any data from the
cameras 208, 216 is converted into a common coordinate frame. This
means, for example, that a path of an object around the room 256 is
expressed by the object tracker 240 in a common coordinate frame. The
object tracker 240 sends data in a common coordinate frame to the output
module 248, for output from the range imaging system 200. Further, the
range imaging system 200 may transmit the data to a post-processing
module 298 that may include, for example, a three-dimensional (3-D)
scene reconstruction system, a 3-D object recognition system or a 3-D
tracking system (which may be part of a vision-based computer interface
system).

IV. Component Overview 

FIG. 3 is a general block diagram of the object tracker 300 (the 
object tracker 240 in FIG. 2 is one example of the object tracker 300) of the 
present invention. In general, position data from cameras (box 310) in local 
coordinate frames is received by the object tracker 300, processed and 
data is sent as output in a common coordinate frame (box 320). The 
object tracker 300 includes a calibration module 330, which determines 
transformation parameters that will transform position data in local 
coordinate frames into a common coordinate frame, and a coordinate 
processor 340, which uses the transformation parameters computed by the 
calibration module 330 to transform data observed by the cameras into a 
desired common coordinate frame. 

FIG. 4 is a block diagram illustrating the calibration module 330 of 
the object tracker 300 shown in FIG. 3. The calibration module determines 
transformation parameters that are used to convert data in a local 
coordinate frame of each camera into a common coordinate frame. In 
general, data from each camera is received as input (box 410) and a data 
synchronizer 420 is used to synchronize the data received from multiple 
cameras. A coordinate selector 430 determines the desired coordinate 
frame of the transformation. For example, a first camera may be selected 
as the base camera and data from the other cameras are expressed in the 
coordinate frame of the base camera. A transformation processor 440
computes transformation parameters that convert data from a local
coordinate frame to be expressed in the base coordinate frame. These 
transformation parameters are sent as output (box 450) of the calibration 
module 330. 

FIG. 5 is a block diagram illustrating the transformation processor
440 of the calibration module shown in FIG. 4. The transformation processor
440 includes an interpolation module 510, for interpolating between data
points, a data matching processor 520, for matching up data points from
different cameras at a certain time, and an error minimization processor
530, for determining the data points that yield the most accurate
transformation parameters. The transformation processor 440 inputs
synchronized data from the data synchronizer 420. A time is then selected
by the interpolation module 510 along with position data corresponding to
that time. If there was no data point sampled by the data modules at the
selected time then the interpolation module 510 interpolates a data point,
as described further below.

The data points at the selected time are received by the data
matching processor 520. In addition, the data matching processor 520
receives a desired coordinate frame as determined by the coordinate
selector 430. The desired coordinate frame may be, for example, chosen
by the user or selected at random. Any data from the cameras is
expressed in the selected coordinate frame (also called the base
coordinate frame). The data matching processor 520 matches data points
at the selected time and computes transformation parameters using the
data points. The error minimization processor 530 determines which data
points give the most accurate transformation parameters.

V. Details of the Components and Operation

FIG. 6 is a general flow diagram of the operation of the calibration
module 330 of the present invention. Generally, data observed by
cameras is received as input (box 610). Depending on the application
there may be two or more cameras, with each camera positioned to
observe data within a scene. Even if there are more than two cameras,
however, the calibration method of the present invention only needs to
measure the relative position and orientation between two cameras at a
time. This is because one camera is designated as the base camera and
the position and orientation of the remainder of the cameras are measured
from that base camera. The raw data from each camera is sampled and
sent to the calibration module 330, with the sampled data from each
camera expressed in its respective local coordinate frame.

One of the cameras is selected as the base camera and the
coordinate frame of the chosen camera becomes the base coordinate frame
(box 620). Transformation parameters are computed (box 630) from the
sampled data received by the calibration module 330. These
transformation parameters are then used to express data received from
each camera in the base coordinate frame. Once this calibration process
is performed, any data observed by a non-base camera can be expressed
in the base coordinate frame as if the data had been observed by the base
camera.

FIGS. 7A-7C are general block diagrams illustrating the operation of
the transformation parameters computed by the calibration module 330
shown in FIG. 4. In FIG. 7A, a first camera 700 and a second camera 705
observe an object path in a scene. A first object path 710 is observed by
the first camera 700 in a first local coordinate frame and a second object
path 715 is observed by the second camera 705 in a second local
coordinate frame. It should be noted that the two cameras 700, 705 observe
the same path but in different local coordinate frames.

MS Docket No. 141382-1

FIG. 7B illustrates a set of transformation parameters computed by
the calibration module 330 applied to the data of the second camera 705.
In particular, the local coordinate frame of the first camera 700 has been
selected as the base coordinate frame and, in accordance with the present
invention, one purpose of the calibration module 330 is to compute
transformation parameters that cause the second object path 715 to
overlap with the first object path 710 as closely as possible. The
transformation parameters include a change in the "x" coordinate (Δx) 730,
a change in the "y" coordinate (Δy) 735 and an angle of rotation (θ) 740.
As shown in FIG. 7B, when the transformation parameters (Δx, Δy, θ) are
applied to the second object path 715, the first object path 710 and second
object path 715 nearly overlap. The lack of exact overlap is due to a slight
amount of error in the calculation of the transformation parameters.
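As a minimal sketch of how a set of parameters (Δx, Δy, θ) might be applied to a path, the Python fragment below rotates each point by θ and then translates it by (Δx, Δy). The order of operations (rotation first, translation second), the use of radians, and the names `apply_transform` and `path_cam2` are assumptions made for illustration; the specification does not fix these conventions.

```python
import math

def apply_transform(points, dx, dy, theta):
    """Rotate each (x, y) point by theta (radians), then translate by (dx, dy).

    Assumed convention: rotation about the origin of the local frame,
    followed by translation. The source text does not specify the order.
    """
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + dx, s * x + c * y + dy) for x, y in points]

# Hypothetical path as seen by the second camera in its local frame.
path_cam2 = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]

# Express the same path in the base coordinate frame.
path_in_base = apply_transform(path_cam2, dx=2.0, dy=-1.0, theta=math.pi / 2)
```

If the parameters are accurate, `path_in_base` should lie close to the path reported by the base camera for the same instants in time.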

FIG. 7C illustrates another set of transformation parameters
computed by the calibration module 330 applied to the data of the second
camera 705. In FIG. 7C the transformation parameters Δx* 750, Δy* 755
and θ* 760 are used to achieve an exact overlap of the first object path
710 and the second object path 715 into a single object path 770. The
exact overlap represents minimum error in the transformation parameters
(Δx*, Δy*, θ*) and means that these transformation parameters can be
used to express data from the second camera 705 in the base coordinate
frame.

FIG. 8 is a detailed flow diagram illustrating a preferred embodiment 
of the present invention. In this preferred embodiment, the present 
invention designates one of a plurality of cameras as a base camera and 
measures the relative pose of the remainder of the cameras with respect to 
the base camera. Initially, one camera is selected as a base camera and
that camera's local coordinate frame becomes the base coordinate frame
(box 800). Moreover, data from each camera is received as input (box 
805). 

Before this data can be used to compute transformation parameters, 
however, at least two problems must be overcome. The first problem 
occurs if the clocks on the computers used to sample the data are 
unsynchronized by a constant time offset so that equivalent time readings 
on the computers do not correspond to the same actual time. The present 
invention corrects this problem by adding a time offset to the data. In 
particular, a time offset value is chosen (box 810) and applied to the 
camera data (box 815) in order to synchronize the data. The second 
problem occurs if the data from the cameras is not sampled at the same
time, leaving, for example, a data point at time t from a first camera without
a corresponding data point from a second camera.

The present invention corrects this problem by performing a linear 
interpolation (box 820) between two data points sampled before and after 
time t. This linear interpolation approximates where a data point would 
have been seen at time t. Next, data matching is performed to provide 
enough data points to compute the corresponding transformation 
parameters. Data matching matches data from different cameras at 
certain absolute times and uses these data points to compute 
transformation parameters. 
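The linear interpolation step can be sketched as follows. This is an illustrative Python fragment, not the implementation described in the specification; the sample layout (a time-sorted list of `(time, x, y)` tuples) and the function name `interpolate_position` are assumptions.

```python
def interpolate_position(samples, t):
    """Estimate the (x, y) position at time t by linear interpolation.

    samples: list of (time, x, y) tuples sorted by time (assumed layout).
    Returns None when t lies outside the sampled time range.
    """
    for (t0, x0, y0), (t1, x1, y1) in zip(samples, samples[1:]):
        if t0 <= t <= t1:
            if t1 == t0:
                # Two samples at the same instant: nothing to interpolate.
                return (x0, y0)
            w = (t - t0) / (t1 - t0)  # fraction of the way from t0 to t1
            return (x0 + w * (x1 - x0), y0 + w * (y1 - y0))
    return None

# Hypothetical samples from one camera; there is no sample at t = 1.0.
samples = [(0.0, 0.0, 0.0), (2.0, 4.0, 2.0)]
estimate = interpolate_position(samples, 1.0)
```

Returning `None` outside the sampled range is a design choice of this sketch: it avoids extrapolating a position the camera never bracketed with measurements.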

FIGS. 9A-9D illustrate an example of how the present invention can
perform data matching. In particular, data from a first camera (camera 1)
in a first local coordinate frame (x₁, y₁) and data from a second camera
(camera 2) in a second local coordinate frame (x₂, y₂) are graphed as a
function of time. FIG. 9A illustrates a graph of x₁ versus time, FIG. 9B
illustrates a graph of y₁ versus time, FIG. 9C illustrates x₂ versus time and
FIG. 9D illustrates y₂ versus time. A time T* is selected such that there is
data available at T* from, for example, camera 1. At time T*, the x₁ and y₁
coordinates from camera 1 (assuming that camera 1 was selected) will be
perfectly synchronized, but there may be no data (i.e. (x₂, y₂)) available
from camera 2. In this situation, data from camera 1 at time T* is matched
such that a first point 910 on the x₁ versus time graph (and a second point
920 on the y₁ versus time graph) are matched with a third point 930 on the
x₂ versus time graph and a fourth point 940 on the y₂ versus time graph.
Note that in FIGS. 9C and 9D there are no sampled data points from
camera 2 at time T*. The present invention performs an interpolation and
chooses sampled data points 950, 955 prior to T* and sampled data points
960, 965 after T*. These sampled points 950, 955, 960, 965 are used to
interpolate values of x₂ and y₂ at time T* to obtain the third and fourth data
points 930, 940. Once the data has been matched at a certain time, the
invention determines whether more data points are needed (box 830). If
so then a different time is chosen and data matching is performed (box
825) at that time. Otherwise, if there are enough data points, an error
minimization technique is used to find the data points that give the
transformation parameters with the least error (box 835).
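Once matched point pairs are available, one way to estimate (Δx, Δy, θ) is a closed-form least squares fit of a planar rotation and translation. The sketch below is an illustration under stated assumptions: it assumes the rotate-then-translate convention, and the function name `fit_rigid_2d` is hypothetical. It is a standard closed-form solution for this problem and is not taken from the specification.

```python
import math

def fit_rigid_2d(src, dst):
    """Least squares rotation and translation mapping src points onto dst.

    src, dst: equal-length lists of matched (x, y) points, for example
    camera 2 points and the camera 1 points observed at the same times.
    Returns (dx, dy, theta) such that rotating src by theta and then
    translating by (dx, dy) best approximates dst.
    """
    n = len(src)
    # Centroids of both point sets.
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    tx = sum(p[0] for p in dst) / n
    ty = sum(p[1] for p in dst) / n
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - sx, ay - sy, bx - tx, by - ty
        dot += ax * bx + ay * by    # cosine-weighted term
        cross += ax * by - ay * bx  # sine-weighted term
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that carries the rotated source centroid onto the
    # destination centroid.
    dx = tx - (c * sx - s * sy)
    dy = ty - (s * sx + c * sy)
    return dx, dy, theta
```

Because every matched pair contributes to the sums, a single badly wrong pair can pull the estimate away from the correct answer, which is the motivation for a more robust error measure.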

For example, the present invention may determine minimum error by
using a least squares technique that is discussed by S. Ma and Z. Zhang
in "Computer Vision" (Chinese Academy of Science, 1998), the entire
contents of which are hereby incorporated by reference. In a preferred
embodiment, however, the present invention uses a least median of
squares technique to determine minimum error. The least median of
squares technique is more robust and less affected by data points that lie
well away from the majority of data points. The least median of squares
technique is discussed in detail by P.J. Rousseeuw and A.M. Leroy in
"Robust Regression and Outlier Detection" (New York: John Wiley and
Sons, 1987), the entire contents of which are hereby incorporated by
reference.

When the transformation parameters with the least amount of error
have been determined, they are stored along with the time offset value
used to synchronize the data (box 840). Next, a determination is made
whether more time offset points are needed (box 845). If more are
needed, then another time offset value is selected (box 850) and the
process begins again at box 815. Otherwise, an error minimization
technique is used to find the time offset value with the least amount of
error (box 855). As before, the least median of squares technique is a
preferred technique to determine the minimum error.
FIG. 10 illustrates an example of how the present invention can
determine an accurate time offset. Specifically, the error for each time
offset value is determined and plotted as shown in FIG. 10. At a point 1000
at which a minimum error occurs, the corresponding time offset value is
noted. When the time offset value at minimum error is determined, both
the time offset value and the corresponding transformation parameters are
sent as output (box 860).
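The loop over boxes 840 to 860 amounts to a one-dimensional search over candidate time offsets. The following sketch assumes a caller-supplied function `calibrate_at(dt)` that returns the minimum error and the transformation parameters for one offset. That function, the toy error curve and all names here are hypothetical stand-ins for illustration only.

```python
def best_time_offset(candidate_offsets, calibrate_at):
    """Return (error, dt, params) for the candidate offset with the least error.

    calibrate_at(dt) is assumed to return (error, params) for that offset.
    """
    best = None
    for dt in candidate_offsets:
        error, params = calibrate_at(dt)
        if best is None or error < best[0]:
            best = (error, dt, params)
    return best


# Toy stand-in for the real calibration: the error is smallest at dt = 0.3.
def toy_calibrate_at(dt):
    return ((dt - 0.3) ** 2, {"theta": 0.0, "dx": 0.0, "dy": 0.0})


offsets = [k / 10 for k in range(-10, 11)]  # -1.0 s to 1.0 s in 0.1 s steps
error, dt, params = best_time_offset(offsets, toy_calibrate_at)
```

The search range and step size are choices of this sketch; the source does not specify them.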

It should be noted that in a preferred embodiment the transformation
parameters are changes in the x and y coordinates and the rotation angle
(such as $\Delta x$, $\Delta y$, $\theta$). In addition, other transformation parameters may be
used depending on the type of coordinate systems used (such as, for
example, polar coordinate systems).

VI. Working Example

The following working example uses a range imaging system to
track the movement of a person around a room and is provided for
illustrative purposes only. In this working example, the method and system
of the present invention are used to calibrate two range cameras prior to
using the range imaging system. As mentioned above, a variety of
techniques (such as lasers and projected light patterns) are available for
measuring range. Although in general the present invention is capable of
using any ranging technique, in this working example stereo cameras were
used. Stereo cameras were chosen because of their fast frame rate and
because they are inexpensive and safe. In this working example, the
application was tracking people as they move around a room. Further, two
range cameras (camera 1 and camera 2) were used and calibrated based
on a person's path when the person walked around the room.
The calibration process began by determining an (x,y) location of the
person on a ground plane (in this working example, the floor of the room)
as a function of time as measured by each range camera. This was
accomplished using a technique described in co-pending U.S. patent
application serial number 09/455,822 entitled "A System and Process for
Locating and Tracking a Person or Object in a Scene Using a Series of
Range Images" by Barry Brumitt, filed on December 6, 1999, the entire
contents of which are hereby incorporated by reference. The present
invention then chose a first camera as the base camera and designated
the location measured by the base camera as $(x_1, y_1)$ and a corresponding
point from a second (non-base) camera (camera 2) as $(x_2, y_2)$. The present
invention was used to calibrate the two cameras by computing the
transformation parameters of an angle $\theta$ and an offset $(\Delta x, \Delta y)$ that made
the following equation true:

$$
\begin{bmatrix} x_1 \\ y_1 \end{bmatrix} =
\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}
\begin{bmatrix} x_2 \\ y_2 \end{bmatrix} +
\begin{bmatrix} \Delta x \\ \Delta y \end{bmatrix}
$$

Once the values of $\theta$ and $(\Delta x, \Delta y)$ were determined, using this equation, any
point seen by camera 2, $(x_2, y_2)$, could be transformed into the coordinates
of camera 1, $(x_1, y_1)$. This means that a point as seen by camera 2 could be
expressed in the local coordinate frame of camera 1 as if the point was
actually seen by camera 1.
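As a small illustration of expressing a camera-2 point in the frame of camera 1, the sketch below assumes the planar rigid form x1 = x2 cos(theta) - y2 sin(theta) + dx and y1 = x2 sin(theta) + y2 cos(theta) + dy, which is the form consistent with the translation and error expressions given later in this working example. The function name `to_camera1` and the numeric values are hypothetical.

```python
import math


def to_camera1(x2, y2, theta, dx, dy):
    """Map a camera-2 ground-plane point into camera 1's coordinate frame."""
    c, s = math.cos(theta), math.sin(theta)
    return (x2 * c - y2 * s + dx, x2 * s + y2 * c + dy)


# A point one unit along camera 2's x axis, with camera 2 assumed to be
# rotated by 90 degrees and offset by (2, 3) relative to camera 1.
x1, y1 = to_camera1(1.0, 0.0, math.pi / 2, 2.0, 3.0)
```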

In order to synchronize the data coming from the first and second
cameras, a time offset $\Delta t$ was used to correct for the fact that a clock on
the computer associated with the first camera (clock 1) was not
synchronized with a clock on the computer associated with the second
camera (clock 2). Thus, the points from the first camera and the second
camera became $(x_{1i}, y_{1i}, t_{1i})$ and $(x_{2j}, y_{2j}, t_{2j} + \Delta t)$, respectively. An initial guess
of the time offset $\Delta t$ was chosen and a point from each camera was
sampled. Because the sampled points from each camera did not exactly
match up with each other, the data obtained from the second camera was
interpolated as follows.

First, for every point from the first camera taken at time $t_{1i}$, two
points from the second camera were found that were taken as close as
possible on either side of that time (i.e., points $r$ and $f$ were found such
that $t_{2r} + \Delta t < t_{1i} < t_{2f} + \Delta t$). Next, a linear interpolation was performed on the
two points from the second camera, $(x_{2r}, y_{2r})$ and $(x_{2f}, y_{2f})$, to approximate
where the point would have been had it been seen at time $t_{1i}$. If, for any
point in the first data set, surrounding points in the second data set could
not be found, that point in time was ignored. After ignoring such points and
after interpolation, there was a set of corresponding (x,y) points that were
designated as $(x^*_{1k}, y^*_{1k}, x^*_{2k}, y^*_{2k})$, $1 \le k \le n$. The time data in this data set was
ignored because it made no difference in the subsequent computations.

Next, point matching was performed and the transformation
parameters corresponding to the least squared error were selected.
Specifically, in this working example the least median of squares technique
was used because it is a robust method. This method was implemented
by picking random pairs of corresponding points from the data set
$(x^*_{1k}, y^*_{1k}, x^*_{2k}, y^*_{2k})$, $1 \le k \le n$. A pair of points was the minimum number needed
to compute the candidate transformation parameters (i.e., $\theta$ and $(\Delta x, \Delta y)$).
The two pairs of randomly chosen points were
$(x^*_{1a}, y^*_{1a})$, $(x^*_{2a}, y^*_{2a})$, $(x^*_{1b}, y^*_{1b})$, $(x^*_{2b}, y^*_{2b})$, and the angle $\theta$ was computed as:

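$$
\Delta x_1 = x^*_{1b} - x^*_{1a}, \qquad \Delta y_1 = y^*_{1b} - y^*_{1a}
$$

$$
\Delta x_2 = x^*_{2b} - x^*_{2a}, \qquad \Delta y_2 = y^*_{2b} - y^*_{2a}
$$

$$
\cos(\theta) = \frac{\Delta x_1 \Delta x_2 + \Delta y_1 \Delta y_2}{\Delta x_1 \Delta x_1 + \Delta y_1 \Delta y_1}
$$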

and the translation $(\Delta x, \Delta y)$ was:

$$
\Delta x = x^*_{1a} - x^*_{2a}\cos(\theta) + y^*_{2a}\sin(\theta)
$$

$$
\Delta y = y^*_{1a} - x^*_{2a}\sin(\theta) - y^*_{2a}\cos(\theta)
$$

This $\theta$ and $(\Delta x, \Delta y)$ served as a trial solution for the calibration
problem based on the two randomly chosen pairs of points. The solution
was evaluated by computing a list of the squared errors between
corresponding points:

$$
e_k = \left(x^*_{1k} - x^*_{2k}\cos(\theta) + y^*_{2k}\sin(\theta) - \Delta x\right)^2 + \left(y^*_{1k} - x^*_{2k}\sin(\theta) - y^*_{2k}\cos(\theta) - \Delta y\right)^2
$$

The quality of the solution was the median value of this list of squared
errors. In the implementation used for this working example, 100 random pairs of
corresponding points were chosen and the transformation parameters $\theta$
and $(\Delta x, \Delta y)$ that corresponded to the least median of squares were used.
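The random-pair procedure above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the working example: correspondences are tuples (x1, y1, x2, y2), the trial angle is taken with `math.atan2` of the cross and dot products of the two displacement vectors (a choice of this sketch that keeps the sign of the angle), and all function names are hypothetical.

```python
import math
import random
import statistics


def pose_from_two_pairs(pa, pb):
    """Trial (theta, dx, dy) from two correspondences (x1, y1, x2, y2)."""
    d1x, d1y = pb[0] - pa[0], pb[1] - pa[1]  # displacement seen by camera 1
    d2x, d2y = pb[2] - pa[2], pb[3] - pa[3]  # displacement seen by camera 2
    # Signed angle that rotates the camera-2 displacement onto camera 1's.
    theta = math.atan2(d2x * d1y - d2y * d1x, d2x * d1x + d2y * d1y)
    c, s = math.cos(theta), math.sin(theta)
    dx = pa[0] - pa[2] * c + pa[3] * s
    dy = pa[1] - pa[2] * s - pa[3] * c
    return theta, dx, dy


def squared_errors(points, theta, dx, dy):
    """Squared distance between each camera-1 point and its mapped partner."""
    c, s = math.cos(theta), math.sin(theta)
    return [(x1 - x2 * c + y2 * s - dx) ** 2 + (y1 - x2 * s - y2 * c - dy) ** 2
            for x1, y1, x2, y2 in points]


def least_median_of_squares(points, trials=100, seed=0):
    """Return (median squared error, (theta, dx, dy)) of the best trial."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    best = None
    for _ in range(trials):
        pa, pb = rng.sample(points, 2)
        params = pose_from_two_pairs(pa, pb)
        median = statistics.median(squared_errors(points, *params))
        if best is None or median < best[0]:
            best = (median, params)
    return best
```

Because each trial is scored by the median rather than the sum of the squared errors, a trial built from two good correspondences still scores well when a minority of the other points are gross outliers.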

The least median of squares technique was used as above to
compute the best $\theta$ and $(\Delta x, \Delta y)$ for a whole series of values of the time
offset $\Delta t$. Whichever $\Delta t$ gave the minimum least median of squares was
chosen as the best one, and the corresponding $\theta$ and $(\Delta x, \Delta y)$ were used for
the final solution.

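As an alternative to the least median of squares technique described
above, a least squares solution could have been used to determine a
minimum error. The least squares solution to the calibration problem
computes the transformation parameters $\theta$ and $(\Delta x, \Delta y)$ that minimize the
sum of the squared Euclidean distances between corresponding points in
$(x^*_{1k}, y^*_{1k}, x^*_{2k}, y^*_{2k})$. The angle, $\theta$, is given by

$$
\tan(\theta) = \frac{\sum_{k=1}^{n}\left[(x^*_{2k} - \bar{x}_2)(y^*_{1k} - \bar{y}_1) - (y^*_{2k} - \bar{y}_2)(x^*_{1k} - \bar{x}_1)\right]}{\sum_{k=1}^{n}\left[(x^*_{2k} - \bar{x}_2)(x^*_{1k} - \bar{x}_1) + (y^*_{2k} - \bar{y}_2)(y^*_{1k} - \bar{y}_1)\right]}
$$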

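The above equation depends on the following equation, which
computes the centroids of the points from each camera:

$$
(\bar{x}_1, \bar{y}_1) = \frac{1}{n}\left(\sum_{k=1}^{n} x^*_{1k}, \; \sum_{k=1}^{n} y^*_{1k}\right)
$$

$$
(\bar{x}_2, \bar{y}_2) = \frac{1}{n}\left(\sum_{k=1}^{n} x^*_{2k}, \; \sum_{k=1}^{n} y^*_{2k}\right)
$$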

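The translation $(\Delta x, \Delta y)$ is then given by

$$
\Delta x = \bar{x}_1 - \bar{x}_2\cos(\theta) + \bar{y}_2\sin(\theta)
$$

$$
\Delta y = \bar{y}_1 - \bar{x}_2\sin(\theta) - \bar{y}_2\cos(\theta)
$$

The $\theta$ and $(\Delta x, \Delta y)$ computed are the solution to the calibration problem.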

The quality (or amount of error) of the solution is given by the
average squared distance between corresponding points:

$$
e^2 = \frac{1}{n}\sum_{k=1}^{n}\left[\left(x^*_{1k} - x^*_{2k}\cos(\theta) + y^*_{2k}\sin(\theta) - \Delta x\right)^2 + \left(y^*_{1k} - x^*_{2k}\sin(\theta) - y^*_{2k}\cos(\theta) - \Delta y\right)^2\right]
$$

For a series of values of the time offset, $\Delta t$, the transformation parameters
$\theta$, $(\Delta x, \Delta y)$ and $e^2$ are computed. The average squared distance between
corresponding points, $e^2$, will be a minimum for some value of $\Delta t$. We take
the corresponding values of $\theta$ and $(\Delta x, \Delta y)$ at the value of $\Delta t$ that minimizes $e^2$ as
the solution to the calibration problem.
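For a single time offset, the least squares alternative can be sketched as below. The sketch assumes the standard closed-form solution for aligning two planar point sets under a rotation and translation mapping camera 2 into camera 1, with `math.atan2` used to resolve the quadrant of the angle. The function name and the tuple layout (x1, y1, x2, y2) are hypothetical.

```python
import math


def least_squares_pose(points):
    """Closed-form theta, (dx, dy) and mean squared error e2.

    points holds correspondences (x1, y1, x2, y2); camera 2 maps into camera 1.
    """
    n = len(points)
    m1x = sum(p[0] for p in points) / n  # centroid of the camera-1 points
    m1y = sum(p[1] for p in points) / n
    m2x = sum(p[2] for p in points) / n  # centroid of the camera-2 points
    m2y = sum(p[3] for p in points) / n
    num = sum((x2 - m2x) * (y1 - m1y) - (y2 - m2y) * (x1 - m1x)
              for x1, y1, x2, y2 in points)
    den = sum((x2 - m2x) * (x1 - m1x) + (y2 - m2y) * (y1 - m1y)
              for x1, y1, x2, y2 in points)
    theta = math.atan2(num, den)  # tan(theta) = num / den, quadrant resolved
    c, s = math.cos(theta), math.sin(theta)
    dx = m1x - m2x * c + m2y * s  # translation maps centroid onto centroid
    dy = m1y - m2x * s - m2y * c
    e2 = sum((x1 - x2 * c + y2 * s - dx) ** 2 + (y1 - x2 * s - y2 * c - dy) ** 2
             for x1, y1, x2, y2 in points) / n
    return theta, dx, dy, e2
```

Calling this once per candidate time offset and keeping the offset with the smallest `e2` mirrors the selection step described above.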

This least squares solution works well in spite of small errors in
tracking the position of the person in the room. However, there can be
outlier points due to gross errors in the process that tracks the person.
These outlier points are (x, y) locations that deviate greatly from the actual
location of the person. In this case, the least squares solution will be
drawn away from the right answer, and a technique that is robust to
such errors should be used, such as the least median of squares technique
described above.
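The effect of a gross tracking error on the two quality measures can be seen with a handful of numbers. The squared-error values below are made up purely for illustration and are not measurements from the working example.

```python
import statistics

# Hypothetical squared errors for seven well-tracked points (arbitrary units).
inlier_sq_errors = [0.01, 0.02, 0.015, 0.012, 0.018, 0.011, 0.016]
# The same list with one gross tracking error appended.
with_outlier = inlier_sq_errors + [25.0]

mean_clean = statistics.mean(inlier_sq_errors)
mean_dirty = statistics.mean(with_outlier)      # dominated by the outlier
median_clean = statistics.median(inlier_sq_errors)
median_dirty = statistics.median(with_outlier)  # barely moves
```

The mean, which is what a least squares criterion tracks, grows by more than two orders of magnitude here, while the median stays close to its original value.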

The foregoing description of the preferred embodiments of the
invention has been presented for the purposes of illustration and
description. It is not intended to be exhaustive or to limit the invention to
the precise form disclosed. Many modifications and variations are possible
in light of the above teaching. It is intended that the scope of the invention
be limited not by this detailed description of the invention, but rather by the
claims appended hereto.

WHAT IS CLAIMED IS:

1. A method of determining a relative position and orientation
between a base camera and a non-base camera, comprising:
     measuring a path of an object with the base camera in a base
coordinate frame;
     measuring the object path with the non-base camera in a non-base
coordinate frame;
     calculating transformation parameters based on the object
path;
     applying the transformation parameters to the object path
measured by the non-base camera such that the object path
measured by the non-base camera may be expressed in the base
coordinate frame.






2. The method of claim 1, wherein the object path is a person
moving around a scene.

3. The method of claim 1, wherein calculating transformation
parameters comprises performing matching of data measured by the base
and non-base cameras.

4. The method of claim 3, wherein data matching is used to
solve a set of transformation equations.






5. The method of claim 4, wherein data matching comprises
selecting a time value and matching points of the object path as measured
by the base camera at the time value with points of the object path as
measured by the non-base camera at the time value.

6. The method of claim 5, wherein interpolation is used to
generate a data point at the time value if no data point was measured at
the time value.


7. The method of claim 3, further comprising using an error
minimization technique to determine transformation parameters with the
least amount of error.

8. The method of claim 7, wherein the error minimization
technique is a least squares solution.

9. The method of claim 7, wherein the error minimization
technique is a least median of squares solution.

10. The method of claim 3, further comprising applying a time
offset to data from at least one of the base and non-base cameras to
correct for unsynchronized data between the base and non-base cameras.

11. The method of claim 10, wherein a set of time offset values
and corresponding transformation parameters are calculated and an error
minimization technique is used to determine the time offset value with the
least amount of error.

12. A method of measuring a relative pose between two
cameras, comprising:
     selecting a time offset value corresponding to a time shift
between the two cameras; and
     calculating a transformation parameter using the time offset
value capable of transforming data in a coordinate frame of one of the two
cameras into a coordinate frame of the other of the two cameras so as to
obtain the relative pose.

13. The method of claim 12, further comprising applying the time
offset value to data from at least one of the two cameras.

14. The method of claim 13, wherein the data are measurements
by the two cameras of a path of an object.

15. The method of claim 13, wherein a plurality of time offset
values are selected and a corresponding transformation parameter is
calculated for each of the plurality of time offset values.

16. The method of claim 15, wherein one of the plurality of time
offset values is chosen as a most correct time offset based on an error
function.

17. The method of claim 16, wherein the error function includes a
least squares solution.

18. The method of claim 16, wherein the error function includes a
least median of squares technique.





19. A method of calibrating a first and a second range camera,
comprising:
     measuring a path of an object with the first range camera to
generate a first observed object path;
     measuring the object path with the second range camera to
generate a second observed object path; and
     calculating a transformation parameter that causes the first
observed object path to approximately overlap with the second observed
object path so as to determine a relative pose between the first and
second range cameras.

20. The method of claim 19, wherein the transformation parameter
is calculated using a time offset value.




21. A calibration system for calibrating range cameras, comprising:
     a data input module that can transmit data measured by the
range cameras;
     a data synchronizer that can synchronize the data from each
of the range cameras;
     a data matching processor that can match the synchronized
data from each of the range cameras; and
     an error minimization processor that can use the
synchronized data to compute a transformation parameter having a
minimum error.

22. The calibration system of claim 21, wherein the data is
obtained from the range cameras measuring a path of an object in a
scene.

23. The calibration system of claim 21, wherein the data
synchronizer uses a time offset value to synchronize the data and the time
offset value is used by the error minimization processor to compute the
transformation parameter.

24. The calibration system of claim 21, further comprising a
coordinate selector that can select a base coordinate system for use in the
calculation of the transformation parameter.

25. The calibration system of claim 24, further comprising an
interpolation module that can interpolate data for use in the data matching
processor.

ABSTRACT OF THE DISCLOSURE



A method and a system for measuring a relative position and
orientation of range cameras using a movement of an object within a
scene. In general, the present invention determines the relative pose
between two cameras by measuring a path the movement of the object
makes within a scene and calculating transformation parameters based on
these measurements. These transformation parameters are used to
determine the relative position of each camera with respect to a base
camera. In a preferred embodiment, the present invention also includes
other novel features such as a data synchronization feature that uses a
time offset between cameras to obtain the transformation parameters. In
addition, the present invention includes a technique that improves the
robustness and accuracy of solving for the transformation parameters and
an interpolation process that interpolates between sampled points if there
is no data at a particular instant in time. Further, the present invention
includes a system for determining a relative position and orientation of
range cameras using spatial movement that incorporates the method of
the present invention.

EXPRESS MAIL CERTIFICATE UNDER 37 C.F.R. §1.10

Express Mail mailing number EK477010376US

Date of Deposit: April 5, 2000

I hereby certify that this paper or fee is being deposited with the United
States Postal Service "Express Mail Post Office to Addressee" service
under 37 C.F.R. §1.10 on the date indicated above and is addressed to



Signature:

DECLARATION AND POWER OF ATTORNEY FOR PATENT APPLICATION



As a below named inventor, I hereby declare that: 

My residence, post office address and citizenship are as stated below next to my name. I believe I am 
the original, first and sole inventor (if only one name is listed below) or an original, first and joint inventor 
(if plural names are listed below) of the subject matter which is claimed and for which a patent is sought 
on the invention entitled: 

RELATIVE RANGE CAMERA CALIBRATION 



[ ] divisional
[ ] continuation
[ ] continuation-in-part

the specification of which
(check one) XX is attached hereto
[ ] was filed on



I hereby state that I have reviewed and understand the contents of the above-identified specification, 
including the claims, as amended by any amendment referred to above. 

I acknowledge the duty to disclose information which is material to the examination of this application in
accordance with Title 37, Code of Federal Regulations §1.56(a).

I do not know and do not believe that the invention was ever known or used in 
the United States of America before my or our invention thereof; 

I do not know and do not believe that the invention was ever patented or 

described in any printed publication in any country before my or our invention thereof or more 
than one year prior to this application; 

I do not know and do not believe that the invention was in public use or on 

sale in the United States of America more than one year prior to this application. 



This declaration is of the following type:

[X] original

Application Serial No.
and was amended on

(if applicable)

The invention has not been patented or made the subject of any inventor's

certificate issued before the date of this application in any country foreign to the United States of 
America on an application filed by me or my legal representatives or assigns more than twelve 
months prior to this application; and 

As to applications for patents or inventor's certificate on the invention 

filed in any country foreign to the United States of America, prior to this application by me or my 

legal representatives or assign: 

XX no such applications have been filed, or 

such applications have been filed as follows: 



PRIORITY CLAIM (35 U.S.C. §119) 

We hereby claim foreign priority benefits under Title 35, United States Code, §119, of any foreign
application(s) for patent or inventor's certificate or of any PCT international application(s) designating at
least one country other than the United States of America listed below, and have also identified below
any foreign application(s) for patent or inventor's certificate or any PCT international application(s)
designating at least one country other than the United States of America filed by me on the same
subject matter having a filing date before that of the application(s) of which priority is claimed.

Prior foreign/PCT application(s) filed within 12 mos. (6 mos. for design) prior to this
application, and any priority claims under 35 U.S.C. §119

Country/PCT Application No. Date Filed Priority Claimed

NONE [ ] Yes [ ] No

[ ] Yes [ ] No

[ ] Yes [ ] No

All foreign application(s), if any, filed more than 12 mos. (6 mos. for design) prior to this
U.S. application 

Country: NONE 
Application No.: 
Filing Date: 

PRIOR U.S. APPLICATION(S) FOR WHICH BENEFIT
UNDER 35 U.S.C. §119(e) or 35 U.S.C. §120 IS CLAIMED

Serial No. Filing Date Patented Pending Abandoned Provisional Application
NONE